Hybrid open-loop/closed-loop compression of pictures

ABSTRACT

In a method of video coding, in which a difference is formed between input picture values and picture prediction values and that difference is transformed with a DCT, the picture prediction is formed as: P=(1−c)PC+cPO, where PC is a closed loop predictor which is restricted to prediction values capable of exact reconstruction in a downstream decoder and PO is a spatial predictor which is not restricted to prediction values capable of exact reconstruction. The factor c can vary from zero to unity depending on a variety of parameters.

RELATED APPLICATIONS

This patent application is a divisional application of co-pending U.S. patent application Ser. No. 13/383,229, filed Jan. 10, 2012, which was a national stage filing under 35 U.S.C. 371 of International Application No. PCT/EP2010/059945, filed Jul. 9, 2010, which claims priority to International Application No. PCT/EP2009/058879, filed Jul. 10, 2009, and International Application No. PCT/EP2009/058880, filed Jul. 10, 2009, the contents of each of which are hereby incorporated by reference.

BACKGROUND

This invention relates to compression coding of pictures and especially to the coding of pictures in video sequences. The term picture is used here to include fields and frames.

An early idea in the compression of pictures, known as Differential Pulse Code Modulation (DPCM), was to transmit not a pixel value but the difference between that pixel value and a predicted value for that pixel. This differential approach can exploit spatial redundancy in a picture and can exploit both spatial and temporal redundancies in a video sequence of pictures.

As video compression techniques developed towards the well known MPEG compression schemes, attention focussed on the use of differential techniques in the temporal domain. With accurate motion measurement techniques defining motion vectors between blocks in successive pictures, inter-picture differences can be very small and coded highly efficiently. To exploit spatial redundancy, spatial transform techniques were preferred, and applied both to motion-predicted (inter-coded) and unpredicted (intra-coded) areas of the picture.

A well recognised video encoder thus included motion compensated prediction, DCT or other spatial transform, quantisation and variable length or other entropy coding.

Efforts have continued in MPEG and in other coding regimes to increase coding efficiency and to extend coding capability to HDTV and still higher picture resolutions.

One technique included in MPEG-4 Part 10/AVC/H.264 is to supplement the spatial transform with intra-picture prediction. In the decoder, data from blocks which have already been decoded and reconstructed can be used to provide a spatial prediction for the current block. In the encoder, this intra-prediction is of course made available through the local decoder.

This additional spatial prediction has been found to increase performance significantly, particularly for edge detail and for strongly directional texture, such as diagonal stripes.

Experiments have however shown that the increase in performance is greatest at small block sizes and that performance declines as block sizes increase. This is a problem, first because transform coding gain is relatively poor for small block sizes and efficient transform coding demands large block sizes. Second, moves to higher definitions will necessarily involve still larger block sizes. To give some examples, intra-predictions have been found to work well with 4×4 block sizes. Moving to 8×8 blocks might give around 1 dB in transform coding gain, but spatial prediction becomes more complex and less effective. At HD resolutions and above, 16×16 blocks or larger transforms will be needed (perhaps up to 64×64 for UHDTV).

A similar tension exists in motion-compensated prediction: a larger block size requires that fewer motion vectors are encoded, and allows the use of larger transforms on the residual. However, it increases the likelihood that some part of the large block will be poorly predicted, perhaps because of the motion of some small object or part of an object within the block area.

SUMMARY

The present invention addresses this tension between the small block size required for effective prediction and the large block size required (especially at increased definition) for effective transform coding gain.

Accordingly, the present invention consists in one aspect in a method of compression coding, comprising the steps of forming a difference between input picture values and picture prediction values; and transforming the difference; wherein a picture prediction is formed by the combination of a predictor PC which is restricted to prediction values capable of exact reconstruction in a downstream decoder and a spatial predictor PO which is not restricted to prediction values capable of exact reconstruction in a downstream decoder.

The spatial predictor PO may access pixels within the current block, so enabling effective intra-picture spatial prediction within blocks that are large enough for efficient transform coding.

The combination of the spatial predictor PO with a predictor PC which is restricted to prediction values capable of exact reconstruction in a downstream decoder may enable noise growth to be controlled. The combination may comprise a weighted sum of the respective outputs of the predictor PC and the predictor PO, such as

P=(1−c)PC+cPO

where c is a weighting factor variable between zero and unity.

Whilst c may be chosen so as to optimise the control of noise growth and the accuracy of prediction, the overall weight of the prediction does not in this example change.

The factor c, or more generally the relative weighting of the predictor PC and the predictor PO, may vary with picture content.

Often, the predictor PC will be a spatial predictor, such as, for example, the described H.264 spatial predictor. In other arrangements, the predictor PC is a temporal predictor such as the well known motion compensated predictor in the MPEG standards.

In another aspect, the present invention consists in a method of compression decoding a bitstream encoded as outlined above, comprising the steps of receiving a bitstream representing picture differences; exactly reconstructing the prediction values of the predictor PC; inexactly reconstructing the prediction values of the predictor PO; and using a combination of the reconstructed prediction values for summation with the picture differences.

In yet a further aspect, the present invention consists in a video compression encoder comprising: a block splitter receiving picture information and splitting the picture information into spatial blocks; a block predictor operating on a block to provide block prediction values for the block; subtractor means receiving picture information and prediction values and forming difference values; a block transform conducting a spatial transform to provide transform coefficients; a quantisation unit for producing approximations to the transform coefficients; an entropy coding unit for encoding transform coefficients into a coded bitstream; an inverse quantisation unit for reconstructing transform coefficients; an inverse block transform conducting an inverse spatial transform on the transform coefficients to provide locally decoded picture values; and a local decoder predictor operating on the locally decoded picture values to provide local decoder prediction values, wherein the prediction values received by said subtractor means comprise a combination of the block prediction values and the local decoder prediction values.

In this aspect, the present invention also consists in a video compression decoder comprising: an input receiving a compression encoded bitstream representing transformed picture differences organised in blocks; an inverse quantisation unit providing re-scaled transform coefficients; an inverse block transform conducting an inverse spatial transform on the transform coefficients to provide decoded picture values; and a predictor operating on the decoded picture values to provide prediction values for summation with said picture differences, wherein the predictor comprises a closed predictor operating wholly outside a particular block to provide closed prediction values for summation with picture differences in that block and an open predictor operating at least partly inside a particular block to provide open prediction values for summation with picture differences in that block; the prediction values comprising a combination of the closed prediction values and the open prediction values.

In still a further aspect, the present invention consists in a video compression system comprising in an encoder: block splitter means for receiving picture information and splitting the picture information into spatial blocks; block predictor means for operating on a block to provide block prediction values for the block; subtractor means for receiving picture information and prediction values and forming picture difference values; block transform means for conducting a spatial transform on the picture difference values to provide transform coefficients; quantisation means for approximating transform coefficients; inverse quantisation means for reconstructing transform coefficients; inverse block transform means for conducting an inverse spatial transform on the transform coefficients to provide locally decoded picture values; local decoder predictor means for operating on the locally decoded picture values to provide local decoder prediction values, wherein the prediction values received by said subtractor means comprise a variable combination of the block prediction values and the local decoder prediction values; and means for outputting a compression encoded bitstream representing the quantised transform coefficients and including a parameter recording variation of said combination; and further comprising in a decoder: means for receiving said compression encoded bitstream; inverse block transform means for conducting an inverse spatial transform on the transform coefficients to provide decoded picture values; inverse quantisation means for reconstructing transform coefficients; and predictor means operating on the decoded picture values to provide prediction values for summation with said picture differences, wherein the predictor means comprises a closed predictor operating wholly outside a particular block to provide closed prediction values for summation with picture differences in that block and an open predictor operating at least partly inside a particular block to provide open prediction values for summation with picture differences in that block; the prediction values comprising a varying combination of the closed prediction values and the open prediction values, the combination being varied by the predictor means in accordance with said parameter in the bitstream.

Suitably, the picture prediction values P are formed as:

P=(1−c)PC+cPO

where PO is the block prediction values; PC is the local decoder prediction values; and c is a weighting factor variable between zero and unity.

In another aspect the present invention consists in a method of compression coding, comprising the steps of forming a difference between input picture values and picture prediction values; and transforming the difference; wherein a picture prediction is formed by the combination of a closed loop predictor (CLP) which is restricted to prediction values capable of exact reconstruction in a downstream decoder and an open loop predictor (OLP) which is not restricted to prediction values capable of exact reconstruction in a downstream decoder, wherein the open loop predictor and the transform are in the same temporal or spatial domain.

Suitably, the combination comprises a weighted sum of the respective outputs of the CLP and the OLP in which the relative weighting of the CLP and the OLP may vary with picture content. The picture prediction P is formed as:

P=(1−c)Pc+cPo

where c is a weighting factor variable between zero and unity, Pc is the prediction value of the CLP and Po is the prediction value of the OLP.

In one variation, the CLP is a spatial predictor and the OLP is a spatial predictor. The CLP may predict a block from neighbouring, previously coded blocks in the same picture. The OLP may be a pixelwise spatial predictor taking the mean or other combination of adjacent pixels in the same transform block. The spatial transform may be selected from the group consisting of a block transform; a discrete cosine transform (DCT); a discrete sine transform (DST); a wavelet transform; a blocked wavelet transform; a Lapped Orthogonal Transform (LOT); a blocked LOT; or approximations to any of the preceding. The spatial predictions may be performed after motion-compensated prediction, i.e. on a motion compensated prediction residue.

In another variation, the CLP is a motion-compensated prediction (or a combination of motion compensated predictions) from previously-coded pictures and the OLP is a spatial predictor. The CLP may be a block-based motion compensated prediction. The OLP may be a pixelwise spatial predictor taking the mean or other combination of adjacent pixels in the same transform block. The spatial transform may be selected from the group consisting of a block transform; a discrete cosine transform (DCT); a discrete sine transform (DST); a wavelet transform; a blocked wavelet transform; a Lapped Orthogonal Transform (LOT); a blocked LOT; or approximations to any of the preceding.

In still another variation, the CLP is a spatial predictor and the OLP is a motion-compensated prediction (or a combination of motion compensated predictions) from previously-coded pictures. The OLP may be a block-based motion compensated prediction. The CLP may be a spatial predictor from previously-coded blocks in the same picture. The temporal transform may be selected from the group consisting of a block transform; a discrete cosine transform (DCT); a discrete sine transform (DST); a wavelet transform; a blocked wavelet transform; a Lapped Orthogonal Transform (LOT); a blocked LOT; or approximations to any of the preceding.

In yet a further variation, the CLP and the OLP are motion-compensated predictions (or combinations of motion compensated predictions) from previously-coded pictures. The OLP may be a block-based motion compensated prediction. The CLP may be a block-based motion compensated prediction. The temporal transform may be selected from the group consisting of a block transform; a discrete cosine transform (DCT); a discrete sine transform (DST); a wavelet transform; a blocked wavelet transform; a Lapped Orthogonal Transform (LOT); a blocked LOT; or approximations to any of the preceding.

A combination factor c may vary, for example, block by block, frame by frame, or group of pictures (GOP) by GOP. The combination factor may vary within a transform block according to some pre-determined pattern or patterns. The combination factor may be contained in additional meta-data and conveyed alongside the coded data. The chosen pattern may be encoded by means of an index or flag conveyed alongside the coded data.

The gains of the two predictors may sum to unity.

In another aspect, the present invention consists in a method of compression decoding a bitstream encoded in accordance with any one of the preceding aspects, comprising the steps of receiving a bitstream representing picture differences; exactly reconstructing the prediction values of the predictor Pc; inexactly reconstructing the prediction values of the predictor Po; and using a combination of the reconstructed prediction values for summation with the picture differences.

Consider a set of frames, say 4 frames FN, FN+1, FN+2, FN+3, where all frames prior to FN have been coded.

For each block in Fk (k=N, N+1, N+2, N+3) one could have two motion vectors, one representing a closed loop predictor from frames prior to FN, and the other representing an open loop predictor from frames within the set, for example just from the previous frame. Both sets of motion vectors would be transmitted.

The frames could be motion compensated using reconstructed data where possible and original data where not, each block having one motion vector of each sort and using a mixed prediction.

Then a temporal transform could be applied to the 4 frames, for example a 4-point DCT or Hadamard transform, in addition to any spatial transform applied in the blocks. The block coefficients would be quantised and coded.

At the decoder all four frames would be decoded at once, since they were transformed together. After inverse transform (spatial and temporal) this gives 4 residue frames. Using the motion vectors provided, motion compensation can be applied using reconstructed data. If the open loop prediction is always from the previous frame, then reconstructed data is always available if the frames are processed in order.
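
By way of illustration only, and not as part of any standard, the following C++ sketch applies a 4-point Hadamard transform across four co-located residue samples, one from each of the frames FN to FN+3, and shows that the inverse recovers them. The orthonormal scaling is one possible choice and the function names are illustrative.

#include <array>
#include <cstdio>

// Forward 4-point Hadamard transform across four co-located residue
// samples (one per frame). The 1/2 scaling makes the transform orthonormal.
std::array<double, 4> hadamard4(const std::array<double, 4>& r) {
    double a = r[0] + r[1], b = r[0] - r[1];
    double c = r[2] + r[3], d = r[2] - r[3];
    return { 0.5 * (a + c), 0.5 * (b + d), 0.5 * (a - c), 0.5 * (b - d) };
}

// Under this scaling the transform matrix is symmetric and orthogonal,
// so the same routine serves as the inverse.
std::array<double, 4> ihadamard4(const std::array<double, 4>& t) {
    return hadamard4(t);
}

int main() {
    std::array<double, 4> residues = { 3.0, 1.0, 4.0, 1.0 };  // one sample per frame
    auto coeffs = hadamard4(residues);   // these would be quantised and entropy coded
    auto recon  = ihadamard4(coeffs);    // the decoder recovers the residues
    std::printf("%.1f %.1f %.1f %.1f\n", recon[0], recon[1], recon[2], recon[3]);
    return 0;
}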

Note that although the terms “previous” and “later” have been used, the order of the frames need not be the true temporal order in an application; there may have been some temporal reordering to allow for later references as well as earlier ones to be available. So these terms can also mean earlier or later in coding order.

Other aspects of the invention will become apparent by consideration of the detailed description and accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described by way of example with reference to the accompanying drawings, in which:

FIG. 1 is a diagram illustrating a known spatial prediction technique;

FIG. 2 is a block diagram of an encoder;

FIG. 3 is a diagram illustrating spatial prediction;

FIG. 4 is a block diagram of a decoder; and

FIGS. 5 and 6 are block diagrams illustrating multi-pass encoding techniques.

DETAILED DESCRIPTION

As has been mentioned, MPEG-4 Part 10/AVC/H.264 (referred to from here on as H.264 for convenience) contains an addition to previous MPEG standards, which is the provision for intra prediction of blocks. Data along the top and to the left of a block, which has been decoded and reconstructed already, can be used to provide a prediction for the current block, which can now be coded differentially. FIG. 1 shows the 8 possible directional predictions that can be used for 4×4 blocks (previously reconstructed samples are shaded). In addition to these directional predictions, the DC can be predicted from the mean of pixels at the edges of the block, giving 9 modes in all. Other predictions are available for 16×16 and 8×8 blocks.
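
As an informal sketch of one of these modes, the DC mode for a 4×4 block predicts every pixel from the mean of the reconstructed samples above and to the left; the rounding shown follows common practice and the names are illustrative.

#include <cstdint>

// Sketch of 4x4 DC intra prediction: every pixel of the block is predicted
// by the mean of the 4 reconstructed samples above and the 4 to the left.
void dc_predict_4x4(const uint8_t top[4], const uint8_t left[4], uint8_t pred[4][4]) {
    int sum = 0;
    for (int i = 0; i < 4; ++i) sum += top[i] + left[i];
    uint8_t dc = static_cast<uint8_t>((sum + 4) >> 3);   // rounded mean of 8 samples
    for (int y = 0; y < 4; ++y)
        for (int x = 0; x < 4; ++x)
            pred[y][x] = dc;
}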

The H.264 intra prediction tool showed the advantage of combining spatial predictive coding with transform coding. It was particularly effective for edge detail and for strongly directional texture, such as diagonal stripes.

There is however one problem in the inherent tension between the small blocks required to produce a good prediction, and the large blocks required to get good transform coding gain.

The decrease in efficiency in spatial prediction coding with increasing block size can be understood to arise from the increasing distance (at least towards the end of a raster scanned block) between the current pixel and the pixels upon which the prediction is based. As that distance increases, correlation between pixels reduces and so also does differential coding efficiency.

Likewise, motion compensated prediction has formed an essential part of video compression standards since MPEG-1. A key issue is the trade-off between block size and prediction accuracy. Larger block sizes require fewer motion vectors to be coded, but have less prediction accuracy, since small objects or parts of objects in a large block may move differentially with respect to the rest of the block. Transform block sizes are consequentially constrained, since block transforms are generally applied wholly within a prediction block, to avoid the difficulties of transforming the transitions from one block to another. In H.264, these trade-offs may be achieved by selecting from among a very wide variety of motion block partitions.

What would be desirable is to predict at a finer grain than the transform may allow. However, within a block, the samples available to the decoder are those that have been decoded and reconstructed. The samples used in the encoder are the original samples, which are different due to subsequent quantisation. The prediction would in this sense be open loop, in contrast to the closed loop that is provided where a local decoder within the encoder guarantees that the prediction values used by the encoder can be exactly reconstructed in the decoder. This difference could cause significant noise growth.

To see how the reconstruction noise can grow, let P(x0, . . . , xr−1) denote the prediction of sample xr from samples xk, k=0, . . . , r−1. Then the prediction residues yr are given by:

yr = xr − P(x0, . . . , xr−1)

Let L = T⁻¹Q⁻¹QT denote the process of transforming, quantising, inverse quantising and inverse transforming the sequence yr. We can assume for the moment that the effect of L is to add a noise source nr of variance σn² to yr, i.e.

Yr=L(yr)=yr+nr

In reconstructing, the decoder will form

Xr=Yr+P(X0, . . . , Xr−1)

In addition to the noise nr on Yr, the prediction will differ because of the noise on each of the previous reconstructed values Xk, and this noise can therefore grow. In particular, a good predictor P will typically have unity gain at DC, meaning that 1−P has a zero there and the inverse filter has a pole, i.e. infinite gain, at DC. Thus the noise can grow in an unbounded fashion. A closed-loop predictor, in which the predictor uses the reconstructed values Xk at the encoder also, will not have this problem.
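
The effect can be demonstrated numerically. The following sketch is not part of the described encoder: it uses the simplest unity-gain predictor (the previous sample) and a unit quantiser step, and shows that the open-loop reconstruction error grows with every sample while the closed-loop error stays within half a quantiser step.

#include <algorithm>
#include <cmath>
#include <cstdio>

// Round-to-nearest quantiser with a step of 1.
static double quantise(double v) { return std::round(v); }

int main() {
    const int n = 64;
    double prev_orig = 0.0, prev_open = 0.0, prev_closed = 0.0;
    double open_err = 0.0, closed_err = 0.0;
    for (int i = 0; i < n; ++i) {
        double x = 1.7 * i;   // original sample: a simple ramp

        // Open loop: the residue is formed against the ORIGINAL previous
        // sample, but the decoder can only add back its own noisy
        // reconstruction, so the quantisation errors accumulate.
        double open_rec = quantise(x - prev_orig) + prev_open;

        // Closed loop: the residue is formed against the reconstructed
        // previous sample, so each step corrects the accumulated error.
        double closed_rec = quantise(x - prev_closed) + prev_closed;

        open_err   = std::max(open_err,   std::fabs(open_rec   - x));
        closed_err = std::max(closed_err, std::fabs(closed_rec - x));
        prev_orig = x; prev_open = open_rec; prev_closed = closed_rec;
    }
    std::printf("max open-loop error %.1f, max closed-loop error %.1f\n", open_err, closed_err);
    return 0;
}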

There is thus a problem that a closed-loop predictor in conjunction with a transform encoder is limited either in the accuracy of the prediction (if the block is large) or in the efficiency of the transform (if the block is small); and an open-loop predictor can suffer from unbounded noise gain.

A solution to this problem will now be described.

Mathematically, define PC and PO to be two predictors. At the encoder, PC will be applied closed-loop, to produce a prediction constructed solely from previously coded and reconstructed coefficients; PO will be applied open-loop, that is, it will be applied to produce predictions from original, uncoded coefficients. Of course, at the decoder both predictions must use reconstructed coefficients.

An example of PC would be to predict pixels in a block by the mean of pixels in neighbouring blocks. An example of PO would be to predict a pixel from immediately neighbouring pixels, whether they fall in the same block or not. Then a new combined predictor P can be created by:

P=(1−c)PC+c PO

In this case, a factor c applies to PO. If this factor c is between zero and unity it acts as a damping or leakiness factor on the noise contributed by the open-loop predictor in the decoder, and this controls noise growth. Yet the combined predictor remains an excellent predictor due to the complementary contribution of the closed-loop predictor, whereas without a complementary closed-loop predictor the efficacy of the prediction would fall as the factor c got smaller. In particular, if both predictors eliminate DC, then so will the combined predictor.
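
For a single pixel, the combined prediction may be sketched as follows; the argument names are illustrative, pc being, for example, the mean of pixels in previously reconstructed neighbouring blocks, and po the value of an immediately neighbouring pixel which, at the encoder, may be an original sample.

// Sketch of the combined predictor P = (1 - c)*Pc + c*Po for one pixel.
// The residue passed to the transform is then x - P.
double combined_prediction(double pc,   // closed-loop prediction value
                           double po,   // open-loop prediction value
                           double c)    // weighting factor between zero and unity
{
    return (1.0 - c) * pc + c * po;
}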

An example of an encoder architecture is shown in FIG. 2. Input video is received at block splitter B which serves to split the input data into blocks. The block information is supplied to a block store BS and to a subtractor (200). Block store BS provides data to an open loop predictor PO. This predictor operates within the block to provide a prediction value for the current pixel. The predicted value takes the form, in one example, of the arithmetic mean of the three pixels, respectively: horizontally to the left, diagonally to the top left and vertically above the current pixel. It will be understood that a predicted value can be formed in a variety of other ways, utilising further or different neighbouring pixels and applying different weightings to different neighbouring pixels.

The predicted value from PO is supplied as the negative input to subtractor (200) via a multiplier (204) which applies a gain factor c. The parameter c may typically vary between zero and unity. This gain control, when c is less than unity, addresses the problem of noise growth that would otherwise be associated with the use of an open loop predictor.

The output from the subtractor (200) passes through a second subtractor (204) to a conventional DCT or other spatial transform block T. The transform T operates on the received picture difference to provide transform coefficients to a generally conventional quantiser block Q. The quantised transform coefficients undergo variable length or other entropy coding in block EC to provide the output bitstream of the encoder.

To provide the closed loop prediction, a locally decoded block store LDBS is maintained containing values that have been reconstructed just as they would be in the decoder, using only previously reconstructed data values. The closed loop spatial prediction may conveniently take the form of the known H.264 spatial predictor or motion compensated predictors.

The closed loop prediction values are passed as a negative input to the subtractor via a multiplier applying a gain control factor of (1−c).

Values in the LDBS are constructed by means of an inverse quantiser block Q⁻¹, an inverse transform block T⁻¹, and the addition of predictions in the same proportions but using only previously-decoded reconstructed values from the LDBS itself.

Note that in block operation, the feed-forward predictor PO can also use reconstructed samples where they are available, at the top and left of a block, if we assume blocks are scanned in raster order. Thus the samples input to PC are also input to PO. This means that wherever possible the predictions use reconstructed samples.

Thus, referring to FIG. 3, the pixels in the top row and the left column of the current 4×4 block (marked X in the figure) can be predicted entirely or mostly from locally decoded pixels (outside the current block), which are shown cross-hatched in the figure. The result should be to restrict the growth of noise still further by synchronising the prediction in encoder and decoder for each block.
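
A sketch of this sample selection, applied to the three-pixel mean predictor described above, is given below; the array layout, the names and the omission of picture-border handling are for illustration only.

// Open-loop prediction for the pixel at (px, py): the mean of the left,
// top-left and top neighbours. Neighbours lying outside the current block
// (whose top-left corner is (bx, by)) are taken from the locally decoded
// store, exactly as the decoder will see them; neighbours inside the block
// fall back to original samples at the encoder.
double open_loop_prediction(const double* original,      // current picture, row-major
                            const double* reconstructed, // locally decoded picture, row-major
                            int stride,
                            int px, int py,               // current pixel position
                            int bx, int by)               // top-left corner of current block
{
    auto sample = [&](int x, int y) {
        bool outside_block = (x < bx) || (y < by);
        const double* src = outside_block ? reconstructed : original;
        return src[y * stride + x];
    };
    return (sample(px - 1, py) + sample(px - 1, py - 1) + sample(px, py - 1)) / 3.0;
}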

Since there will be little or no divergence between encoder and decoder at the top and left of the block, a lower degree of leakiness may be required there, thus allowing better prediction to be used in these areas. In other words, the parameter c may, in addition to any variation in accordance with picture content, vary with the position of the current pixel in the block. This variation with pixel position need not necessarily be signalled in the bitstream; it may for example form part of an industry or de facto standard.

In this approach, the coefficient order for the prediction and difference generation will not be a raster order but will be block-by-block, scanning coefficients within each block before moving on to the next one. Raster scanning within a block could be used, but other scan patterns may give a better result, for example scanning in concentric ‘L’ shapes as shown in FIG. 3.
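
One possible interpretation of such a scan is sketched below: the block is visited in shells, each shell forming an ‘L’ wrapped around the square of positions already scanned. The routine is illustrative only; FIG. 3 itself defines the intended pattern.

#include <cstdio>
#include <utility>
#include <vector>

// Hypothetical generator for a concentric 'L' scan of an NxN block: the
// k-th shell is the set of positions whose larger coordinate equals k.
std::vector<std::pair<int, int>> l_shape_scan(int n) {
    std::vector<std::pair<int, int>> order;
    for (int k = 0; k < n; ++k) {
        for (int y = 0; y <= k; ++y) order.emplace_back(k, y);   // vertical arm of the shell
        for (int x = 0; x < k; ++x)  order.emplace_back(x, k);   // horizontal arm of the shell
    }
    return order;
}

int main() {
    for (auto [x, y] : l_shape_scan(4)) std::printf("(%d,%d) ", x, y);
    std::printf("\n");
    return 0;
}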

Since the weights applied to the two predictors sum to unity, the total prediction carries no overall weighting factor, and this architecture therefore allows for varying the degree of leakiness across a block without introducing spurious frequency components.

If c is fixed, a useful value has been found to be around 0.5.

The parameter c may be optimised for different picture content depending on the block sizes used, the quantisation parameters selected and the relative success of the two predictions. Small block sizes and low levels of quantisation will in general produce less divergence between the encoder and decoder open loop predictions, and so the overall value of c could be adjusted closer to unity in these circumstances, either through encoding in the bitstream or according to the governing standard.

This system is particularly attractive since it can be combined easily with a whole range of different predictors. For example, PC could be a directional predictor already defined in H.264, and PO could be a directional pixelwise predictor. Alternatively, PC could be a motion compensated temporal predictor.

Or, for wavelet coding, one could use a form of hierarchical coding where the low-pass coefficients provide a closed-loop prediction at each level.

In a further variation, a fixed number of possible values of c could be pre-determined, with an encoder able to choose the best value to use for a particular block or set of blocks, or to choose only to use the closed-loop predictor. Meta-data could be sent accompanying the transform coefficients of each block or set of blocks to indicate whether an open-loop prediction has been used, and which value of c has been employed.

For example, 4 possible non-zero values of c may be used, perhaps ¼, ½, ¾ and 1. Values of 15/32, 25/32, 10/32 and 22/32 have been shown to work well. An encoder would select an optimum value to use, normally by some rate-distortion optimisation method.
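
A rate-distortion selection of c might be sketched as follows; the trial-encoding routine is whatever the surrounding encoder provides and is passed in here, and all names are illustrative.

#include <functional>
#include <limits>
#include <vector>

// Trial-coding result for one candidate value of c.
struct TrialResult { double distortion; double bits; };

// Choose the candidate c that minimises the Lagrangian cost D + lambda*R.
double choose_c(const std::vector<double>& candidates, double lambda,
                const std::function<TrialResult(double)>& trial_encode) {
    double best_c = candidates.front();
    double best_cost = std::numeric_limits<double>::infinity();
    for (double c : candidates) {
        TrialResult r = trial_encode(c);
        double cost = r.distortion + lambda * r.bits;   // rate-distortion cost
        if (cost < best_cost) { best_cost = cost; best_c = c; }
    }
    return best_c;
}

For the example above, the candidate list would hold ¼, ½, ¾ and 1, with lambda the usual Lagrangian multiplier relating rate to distortion.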

In a yet further variation, the value c may be fixed but the open-loop prediction may vary among a number of possible predictors. Meta-data could be sent accompanying the transform coefficients of each block or set of blocks to indicate whether an open-loop prediction has been used and, if so, which has been used.

For example, 4 different directional spatial predictors may be used: a horizontal predictor, a vertical predictor, and two diagonal predictors at 45 degree angles to the vertical and horizontal predictors.

Meta-data for configuring the prediction methods for an individual block or a set of blocks may be encoded by well-known methods for encoding configurable options in existing video standards. For example, an encoder may first encode a flag indicating the presence or absence of an open-loop predictor. If an open-loop predictor is present, the option selected could be encoded in a number of bits. A typical scheme would allow 2^N options, encoded in N bits, as in the following pseudocode for the case N=2:

EncodeBit(using_open_loop);
if (1 == using_open_loop) {
    EncodeBit(combined_pred_mode & 0x01);
    EncodeBit((combined_pred_mode & 0x02) >> 1);
}

Alternatively, there may be some correlation between the metadata of one block and that of previously coded blocks. In that case an encoder may follow a scheme similar to that used for coding intra prediction modes in H.264. It may consider the case where the open loop is not used as an additional prediction mode, making 2^N+1 options. A flag is then coded indicating whether the predicted mode is used. If it is not, then the remaining 2^N modes can be coded using N bits, as illustrated in the following pseudocode for the case N=2:

predicted_mode = get_mode_prediction();
EncodeBit(combined_pred_mode == predicted_mode);
if (combined_pred_mode < predicted_mode) {
    EncodeBit(combined_pred_mode & 0x01);
    EncodeBit((combined_pred_mode & 0x02) >> 1);
} else if (combined_pred_mode > predicted_mode) {
    combined_pred_mode = combined_pred_mode - 1;
    EncodeBit(combined_pred_mode & 0x01);
    EncodeBit((combined_pred_mode & 0x02) >> 1);
}
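
The corresponding decoder-side logic is not set out above; a minimal sketch, assuming a DecodeBit() counterpart to EncodeBit(), would be:

predicted_mode = get_mode_prediction();
if (DecodeBit()) {
    combined_pred_mode = predicted_mode;
} else {
    mode = DecodeBit();
    mode |= DecodeBit() << 1;
    combined_pred_mode = (mode < predicted_mode) ? mode : mode + 1;
}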

A decoder architecture is shown in FIG. 4. In the decoder, the bitstream is received by an entropy decoding block ED and passes through an inverse quantiser block Q⁻¹ and an inverse transform block T⁻¹. The output of the inverse transform block is passed to the input of a decoded block store DBS. Decoded data from the decoded block store is passed to the inputs of a closed loop predictor PC and an open loop predictor PO. The output of PO is passed to an adder (402) through a multiplier (404) applying a gain control factor c. The output of predictor PC is applied to an adder (406) through a multiplier (408) applying gain control factor (1−c). Both the open-loop predictor PO and the gain control factor c may be selectable based on meta-data transmitted by the encoder. The two adders serve to add the weighted prediction outputs to the values output from the inverse transform block. Once reconstructed, the values are passed into the DBS for use in predicting subsequent values.

The input to the DBS of course also provides the video output from thedecoder.
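
Per pixel, the decoder-side reconstruction may be sketched as follows; in contrast to the encoder, both predictions are necessarily formed from previously reconstructed values held in the decoded block store. The names are illustrative.

// Reconstruction of one pixel in the decoder of FIG. 4: the weighted sum
// of the two predictions (both formed from the decoded block store) is
// added to the inverse-transformed difference value.
double reconstruct_pixel(double difference,  // output of the inverse transform
                         double pc,          // closed-loop prediction from the decoded store
                         double po,          // open-loop prediction from the decoded store
                         double c)           // weighting factor, signalled or standardised
{
    return difference + (1.0 - c) * pc + c * po;
}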

Optimum predictors can be selected by adaptive means. Predictors may be chosen, for example, by linear optimisation techniques, minimising mean square error, or by determining and extrapolating local gradients. Whatever the method, a fundamental distinction is between continuous adaptive techniques, whereby the choice of predictor is a continuous function of values in a neighbourhood of the current pixel, and discontinuous techniques in which the predictor is switched.

Operating open-loop, in any adaptive technique the adaptive predictor itself could differ between encoder and decoder. Discontinuous adaptive techniques would appear especially dangerous, since very different predictors could be chosen. In a continuous system, given similar values, similar predictors would be chosen.

As an example of continuous adaption, it can be shown that if pixels are scanned to produce a sequence x(n) with autocorrelation R(k), then the MMSE predictor

${P\left( {x,n} \right)} = {\sum\limits_{k = 1}^{N}\; {a_{r}{x\left( {n - r} \right)}}}$

has coefficients ak which satisfy the system of N linear equations

${{R(j)} - {\sum\limits_{k = 1}^{N}\; {a_{k}{R\left( {j - k} \right)}}}},{j = 1},\ldots,N$

Therefore an adaptive system can be obtained by taking a rolling snapshot of the signal and solving this system. A more tractable adaption method which would approximate this (and converge to it, given stationary statistics) would be to use the LMS or RLS algorithms.
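
An LMS update of the predictor coefficients might be sketched as follows; the step size mu and the choice of history (original or reconstructed samples) are left to the surrounding design, and the names are illustrative.

#include <cstddef>
#include <vector>

// One LMS step for an N-tap linear predictor. 'history' holds the N most
// recent samples, most recent first; mu is the adaptation step size.
void lms_update(std::vector<double>& coeffs,         // a_1 .. a_N
                const std::vector<double>& history,  // x(n-1) .. x(n-N)
                double x_n,                           // current sample
                double mu)
{
    // Prediction and prediction error for the current sample.
    double prediction = 0.0;
    for (std::size_t k = 0; k < coeffs.size(); ++k)
        prediction += coeffs[k] * history[k];
    double error = x_n - prediction;

    // LMS: move each coefficient along the gradient of the squared error.
    for (std::size_t k = 0; k < coeffs.size(); ++k)
        coeffs[k] += mu * error * history[k];
}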

In this case both the basic samples and the autocorrelation functions would be different between encoder and decoder, causing different filters to be used. This might well not be significant, however, if the prediction could be much better. The adaption could be made more stable by assuming a degree of white noise, for example by adding a small delta impulse to the measured autocorrelation R(k), or by directly adding artificial noise to the feedback signal in the LMS/RLS algorithm.

The architectures described above involve predictions using the original, uncoded samples. It is this that causes the noise addition from the prediction process. However, in a compression system an encoder is able to use any samples it likes to generate a bit stream: it is only decoder processes that need to be specified. So the encoder may modify the samples used for prediction in order that they are closer to the samples that the decoder uses for reconstruction. With open-loop predictors, it cannot be guaranteed that the samples are identical, but multiple passes should provide some degree of convergence.

The way to do this is to concatenate two (or more) encoders, so that the prediction utilises data that has been coded and locally decoded by a first encoder. In this case an element of feedback has been reintroduced into the encoding process. Block diagrams are shown in FIGS. 5 and 6. Here a first encoder performs the initial coding in the manner described above. A decoder, again as described above, then produces a first-pass decoded signal, which is then passed to the second encoder. Of course, whilst the first and second encoders are drawn separately, they will typically constitute first and second passes of one hardware or software encoder.

Two basic variants may be considered. In the first, shown in FIG. 5, only the predictor of the second encoder uses the locally decoded version, but the pixels being predicted remain the original ones (with a compensating delay to account for the first-pass encode and decode). In the second, shown in FIG. 6, both prediction and predicted pixels would use a locally decoded version. Any number of these stages may be concatenated in order to achieve greater convergence between encoder and decoder prediction coding processes.

It will be understood that this invention has been described only by way of example and that a wide variety of modifications are possible without departing from the scope of the appended claims. To the extent that described examples include separate features and options, all practicable combinations of such features and options are to be regarded as disclosed herein. Specifically, the subject matter of any one of the claims appended hereto is to be regarded as disclosed in combination with the subject matter of every other claim.

What is claimed is:
 1. A method of compression coding, the method comprising the steps of: forming a difference between input picture values and picture prediction values; and transforming the difference in a transform; wherein a picture prediction P is formed by the combination of a predictor Pc which is a temporal predictor or a spatial predictor and which is restricted to prediction values capable of exact reconstruction in a downstream decoder and a predictor Po which is not restricted to prediction values capable of exact reconstruction in a downstream decoder, the predictor Po being a spatial predictor with the transform being a spatial transform or a temporal transform, or the predictor Po being a temporal predictor with the transform being a temporal transform.
 2. The method according to claim 1, wherein said picture prediction P comprises a weighted sum of the respective outputs of the predictor Pc and the predictor Po of the form: P=aPc+bPo.
 3. The method according to claim 1, wherein the relative weighting of the predictor Pc and the predictor Po varies with picture content.
 4. The method according to claim 1, wherein the picture prediction P is formed as: P=(1−c)Pc+cPo, where c is a selectable weighting factor variable between zero and unity.
 5. The method according to claim 4, wherein metadata indicating the weighting factor c is signalled in a bitstream.
 6. The method according to claim 1, in which the prediction Po or the prediction Pc is selectable and wherein metadata indicating the selectable predictions is signalled in a bitstream.
 7. The method according to claim 1, wherein the prediction Po is a directional predictor selectable from a set of directional predictors.
 8. The method according to claim 1, wherein the difference is transformed in a spatial block transform.
 9. A method of compression decoding a bitstream encoded in accordance with claim 1, comprising the steps of receiving a bitstream representing picture differences; exactly reconstructing the prediction values of the predictor Pc; inexactly reconstructing the prediction values of the predictor Po; and using a combination of the reconstructed prediction values for summation with the picture differences.
 10. The method according to claim 9, wherein the manner of combination of the reconstructed prediction values is varied under the control of a parameter represented in the bitstream.
 11. The method according to claim 1, comprising: in a first step: forming a difference between input picture values and picture prediction values; and transforming the difference; wherein a picture prediction is formed by the combination of a predictor Pc which is restricted to prediction values capable of exact reconstruction in a downstream decoder and a spatial predictor Po which is not restricted to prediction values capable of exact reconstruction in a downstream decoder; in a second step: receiving a bitstream from said first encoding; and inexactly reconstructing the prediction values of the predictor Po; and in a third step: forming a difference between said input picture values and the inexactly reconstructed picture prediction values from said second step; and transforming the difference.
 12. A method of compression coding, the method comprising the steps of: forming a difference between input picture values and picture prediction values; and transforming the difference; wherein a picture prediction is formed by the combination of a closed loop predictor (CLP) which is restricted to prediction values capable of exact reconstruction in a downstream decoder and an open loop predictor (OLP) which is not restricted to prediction values capable of exact reconstruction in a downstream decoder, wherein the open loop predictor and the transform are in the same temporal or spatial domain.
 13. The method according to claim 12, wherein said combination comprises a weighted sum of the respective outputs of the CLP and the OLP, with weighting factors that sum to unity.
 14. The method according to claim 12, wherein the relative weighting of the CLP and the OLP varies with picture content.
 15. The method according to claim 12, in which the CLP is a spatial predictor and the OLP is a spatial predictor.
 16. The method according to claim 15, in which the CLP predicts a block from neighbouring, previously coded blocks in the same picture.
 17. The method according to claim 12, in which the transform is selected from the group consisting of a block transform; a discrete cosine transform (DCT); a discrete sine transform (DST); a wavelet transform; a blocked wavelet transform; a Lapped Orthogonal Transform (LOT); a blocked LOT; or approximations to any of the preceding.
 18. The method according to claim 15, in which spatial predictions are performed after motion-compensated prediction.
 19. The method according to claim 12, in which the CLP is a block-based motion compensated prediction (or combination of motion compensated predictions) from previously coded pictures and the OLP is a spatial predictor.
 20. The method according to claim 15, in which the OLP is a pixelwise spatial predictor taking the mean or other combination of adjacent pixels in the same transform block.
 21. The method according to claim 12, in which the CLP is a spatial predictor and the OLP is a motion-compensated prediction from previously-coded pictures.
 22. The method according to claim 21, in which the CLP is a spatial predictor from previously-coded blocks in the same picture.
 23. The method according to claim 12, in which the CLP and the OLP are motion-compensated predictions from previously-coded pictures.
 24. The method according to claim 12, in which the transform is selected from the group consisting of a block transform; a discrete cosine transform (DCT); a discrete sine transform (DST); a wavelet transform; a blocked wavelet transform; a Lapped Orthogonal Transform (LOT); a blocked LOT; or approximations to any of the preceding.
 25. The method according to claim 12, in which the weighting factors vary block by block, or frame by frame, or group of pictures (GOP) by GOP.
 26. The method according to claim 12, in which the weighting factors vary within a transform block according to a pre-determined pattern encoded by means of an index or flag conveyed alongside the coded data.
 27. The method according to claim 2, in which a and b sum to unity.
 28. A non-transitory computer program product comprising instructions causing programmable apparatus to perform a method comprising: in a first step: forming a difference between input picture values and picture prediction values; and transforming the difference; wherein a picture prediction is formed by the combination of a predictor Pc which is restricted to prediction values capable of exact reconstruction in a downstream decoder and a spatial predictor Po which is not restricted to prediction values capable of exact reconstruction in a downstream decoder; in a second step: receiving a bitstream from said first encoding; and inexactly reconstructing the prediction values of the predictor Po; and in a third step: forming a difference between said input picture values and the inexactly reconstructed picture prediction values from said second step; and transforming the difference.
 29. A video compression encoder comprising: a block splitter receiving picture information and splitting the picture information into spatial blocks; a block predictor operating on a block to provide block prediction values for the block; a subtractor receiving picture information and prediction values and forming difference values; a block transform conducting a spatial transform to provide transform coefficients; a quantisation unit for producing approximations to the transform coefficients; an entropy coding unit for encoding transform coefficients into a coded bitstream; an inverse quantisation unit for reconstructing transform coefficients; an inverse block transform conducting an inverse spatial transform on the transform coefficients to provide locally decoded picture values; and a local decoder predictor operating on the locally decoded picture values to provide local decoder prediction values, wherein the prediction values received by said subtractor comprise a combination of the block prediction values and the local decoder prediction values.
 30. The encoder according to claim 29, wherein said combination comprises the weighted sum of the block prediction values and the local decoder prediction values.
 31. The encoder according to claim 29, wherein the relative weighting of the block prediction values and the local decoder prediction values varies with picture content.
 32. The encoder according to claim 29, wherein the picture prediction P is formed as: P=(1−c)Pc+cPo, where Po is the block prediction values; Pc is the local decoder prediction values; and c is a weighting factor variable between zero and unity.